
Distributions

Discrete Distributions

(The BERNOULLI DISTRIBUTION): $P(X = x) = \theta^x(1-\theta)^{1-x};\ x = 0, 1$

(The BINOMIAL DISTRIBUTION): $P(X = x) = {n\choose x}\,\theta^x(1-\theta)^{n-x};\ x = 0, 1, \ldots, n$

(The GEOMETRIC DISTRIBUTION): $P(X = x) = \theta(1-\theta)^{x-1};\ x = 1, 2, 3, \ldots$

  • Let $k = x - 1$; then $P_X(k) = \theta(1-\theta)^k,\ k = 0, 1, 2, \ldots$

(The NEGATIVE BINOMIAL DISTRIBUTION): $P(Y = k) = {r-1+k \choose r-1}\,\theta^r(1-\theta)^k;\ k = 0, 1, \ldots$

(The POISSON DISTRIBUTION): $P(X = x) = \frac{\lambda^x}{x!}e^{-\lambda},\ x = 0, 1, 2, \ldots$ It arises as the limit of the Binomial: ${n\choose x}(\frac{\lambda}{n})^x(1-\frac{\lambda}{n})^{n-x} \to \frac{\lambda^x}{x!}e^{-\lambda}$ as $n \to \infty$.

  • The sum of independent Poisson random variables is again Poisson: if $X_i \sim \text{Poisson}(\lambda_i)$ independently, then $\sum_i X_i \sim \text{Poisson}(\sum_i \lambda_i)$
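
A quick numerical sketch of the Binomial-to-Poisson limit above; the values of $\lambda$, $x$, and $n$ are illustrative choices:

```python
# Sketch: check numerically that the Binomial(n, lam/n) pmf approaches
# the Poisson(lam) pmf as n grows. lam, x, and the n values are
# illustrative assumptions, not from the notes.
from math import comb, exp, factorial

lam, x = 3.0, 2
for n in (10, 100, 10_000):
    p = lam / n
    print(n, comb(n, x) * p**x * (1 - p)**(n - x))

print("Poisson pmf:", lam**x * exp(-lam) / factorial(x))
```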

(The HYPERGEOMETRIC DISTRIBUTION): $P(X = x) = \frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}};\ \max(0, n-(N-M)) \le x \le \min(M, n)$

Continuous Distributions

(The UNIFORM DISTRIBUTION): $f(x) = \begin{cases} \frac{1}{R-L} & L \le x \le R \\ 0 & \text{otherwise} \end{cases}$

(The NORMAL($\mu, \sigma^2$) DISTRIBUTION PDF): $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp(-\frac{(x-\mu)^2}{2\sigma^2})$, where $\mu$ is the mean and $\sigma^2$ is the variance.

  • $\mu = 0, \sigma = 1$ gives the Standard Normal distribution
  • $\int_{-\infty}^{\infty}\exp(-\frac{(x-\mu)^2}{2\sigma^2})\,dx = \sqrt{2\pi}\,\sigma$
  • For independent $X_i$ with variances $\sigma_i^2$: if $U = \sum_{i=1}^n a_iX_i$ and $V = \sum_{i=1}^n b_iX_i$, then $\mathrm{Cov}(U,V) = \sum_{i=1}^n a_ib_i\sigma_i^2$. For jointly normal random variables only, $\mathrm{Cor}(U,V) = 0 \implies U, V$ are independent
  • The sum of independent normal random variables is still normal: $\sum_i X_i \sim N(\sum_i \mu_i, \sum_i \sigma_i^2)$
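
A minimal simulation sketch of the last two bullets; the coefficients $a_i$, $b_i$ and all parameter values are illustrative assumptions:

```python
# Sketch: simulate independent normals and check Cov(U,V) = sum a_i b_i sigma_i^2,
# and that their sum has the stated mean and variance.
# All parameter values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([1.0, 2.0, 0.5])
a = np.array([1.0, -1.0, 2.0])
b = np.array([0.5, 1.0, 1.0])

X = rng.normal(mu, sigma, size=(500_000, 3))  # rows are draws of (X_1, X_2, X_3)
U, V = X @ a, X @ b

print("Cov(U,V) empirical:", np.cov(U, V)[0, 1])
print("Cov(U,V) formula:  ", np.sum(a * b * sigma**2))

S = X.sum(axis=1)
print("sum mean/var:", S.mean(), S.var())
print("expected:    ", mu.sum(), np.sum(sigma**2))
```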

(The EXPONENTIAL DISTRIBUTION): $f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$

(The GAMMA DISTRIBUTION): $f(x) = \frac{\lambda^{\alpha}x^{\alpha-1}}{\Gamma(\alpha)}e^{-\lambda x};\ \alpha, \lambda > 0$

  • $\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1}e^{-t}\,dt,\ \alpha > 0$
  • $\Gamma(\alpha+1) = \alpha\Gamma(\alpha)$; in particular, $\forall n \in \mathbb{N},\ \Gamma(n) = (n-1)!$
  • $\Gamma(1/2) = \sqrt{\pi}$
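
These identities are easy to check numerically with Python's `math.gamma` (the test value of $\alpha$ is arbitrary):

```python
# Sketch: check the Gamma-function identities with math.gamma.
from math import factorial, gamma, pi, sqrt

alpha = 4.7  # arbitrary test value
print(gamma(alpha + 1), alpha * gamma(alpha))  # Gamma(a+1) = a * Gamma(a)
print(gamma(5), factorial(4))                  # Gamma(n) = (n-1)!
print(gamma(0.5), sqrt(pi))                    # Gamma(1/2) = sqrt(pi)
```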

The Cumulative Distribution Function (CDF) describes probabilities of sets of the form $(-\infty, x]$. It is the function $F_X: \mathbb{R} \to [0,1]$ defined by $F_X(x) = P(X \le x) = \int_{-\infty}^x f(y)\,dy$ (in the absolutely continuous case).

  • From the definition, $P(a < X \le b) = F_X(b) - F_X(a)$
  • $\lim\limits_{x\to -\infty} F_X(x) = 0,\ \lim\limits_{x\to \infty} F_X(x) = 1$
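
A small sketch checking $P(a < X \le b) = F_X(b) - F_X(a)$ for a standard normal, writing the CDF via the error function; the endpoints $a, b$ are arbitrary:

```python
# Sketch: compare P(a < X <= b) estimated by simulation with F(b) - F(a),
# where F is the standard normal CDF written via the error function.
from math import erf, sqrt
import numpy as np

def F(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

a, b = -0.5, 1.25  # arbitrary endpoints
x = np.random.default_rng(1).standard_normal(1_000_000)
print("simulated: ", np.mean((a < x) & (x <= b)))
print("F(b)-F(a):", F(b) - F(a))
```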

Joint Distributions

(The MULTINOMIAL DISTRIBUTION): $f(x_1,\ldots,x_k; n; p_1,\ldots,p_k) = \begin{cases} \frac{n!}{x_1!\cdots x_k!}p_1^{x_1}\cdots p_k^{x_k} & \text{when } \sum_{i=1}^k x_i = n \\ 0 & \text{otherwise} \end{cases}$
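
A minimal sketch of evaluating this pmf directly; `multinomial_pmf` is a hypothetical helper written for these notes, and the counts and probabilities are illustrative:

```python
# Sketch: evaluate the multinomial pmf directly. multinomial_pmf is a
# hypothetical helper, not a standard library function.
from math import factorial, prod

def multinomial_pmf(xs: list[int], ps: list[float]) -> float:
    n = sum(xs)
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)
    return coef * prod(p**x for p, x in zip(ps, xs))

# P(X = (2, 3, 5)) for n = 10 trials with cell probabilities (0.2, 0.3, 0.5)
print(multinomial_pmf([2, 3, 5], [0.2, 0.3, 0.5]))
```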

(The BETA DISTRIBUTION): $f(x) = \frac{1}{B(\alpha,\beta)}x^{\alpha-1}(1-x)^{\beta-1} = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$

  • Beta function: $B(\alpha,\beta) = \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,dt = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$, an identity that holds for all $\alpha, \beta > 0$
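
A numerical sketch of the Beta-Gamma identity, comparing a midpoint-rule integral against the Gamma ratio (the parameters $\alpha, \beta$ are arbitrary):

```python
# Sketch: check B(alpha, beta) = Gamma(a)Gamma(b) / Gamma(a+b) against a
# midpoint-rule integral of t^(a-1) (1-t)^(b-1) over (0, 1).
from math import gamma
import numpy as np

a, b = 2.5, 3.5  # arbitrary parameters
n = 1_000_000
t = (np.arange(n) + 0.5) / n                       # midpoints of a grid on (0, 1)
integral = np.mean(t**(a - 1) * (1 - t)**(b - 1))  # midpoint rule, total width 1
print("integral:   ", integral)
print("Gamma ratio:", gamma(a) * gamma(b) / gamma(a + b))
```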

(The BIVARIATE NORMAL DISTRIBUTION): Let $X$, $Y$ be normal random variables with means $\mu_1, \mu_2$ and variances $\sigma_1^2, \sigma_2^2$ respectively, and let $p$ be their correlation, with $-1 < p < 1$. Then their joint density is $f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-p^2}} \times \exp\left(-\frac{1}{2(1-p^2)}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 + \left(\frac{y-\mu_2}{\sigma_2}\right)^2 - 2p\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right)\right]\right)$

Conditional Distribution of Random Variable

Let $X$ and $Y$ be jointly absolutely continuous random variables with joint density function $f_{X,Y}(x,y)$. The conditional density of $Y$ given that $X = x$ is the function $f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)}$, defined wherever $f_X(x) > 0$.

Sampling Distribution

A sampling distribution is a probability distribution of a sample statistic. More formally, let $Y = h(X_1, \ldots, X_n)$ be any function of the sample. The probability distribution of $Y$ is called a sampling distribution.

The standard error of the sample mean of $n$ i.i.d. observations with standard deviation $\sigma$ is $SD = \sigma/\sqrt{n}$.
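
A quick simulation of this standard-error formula (the values of $\sigma$, $n$, and the number of replications are illustrative):

```python
# Sketch: the SD of the sample mean across repeated samples is sigma / sqrt(n).
# sigma, n, and reps are illustrative values.
import numpy as np

rng = np.random.default_rng(2)
sigma, n, reps = 3.0, 25, 200_000
means = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
print("empirical SE: ", means.std())
print("sigma/sqrt(n):", sigma / np.sqrt(n))
```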

(The CHI-SQUARE DISTRIBUTION): Let $Z_i \sim N(0,1)$ be independent and $Y = g(Z_1,\ldots,Z_k) = \sum_{i=1}^k Z_i^2$; then $Y \sim \chi^2(k)$

  • $k$ is the degrees of freedom; $Y \sim \text{Gamma}(k/2, 1/2)$ in the rate parameterization above
  • Let $\bar X$ be the sample mean and $s^2 = \frac{\sum_{i=1}^n(X_i-\bar X)^2}{n-1} = TSS/(n-1)$ the sample variance, where $X_i \overset{\text{i.i.d.}}{\sim} N(\mu,\sigma^2)$. Then $(n-1)s^2/\sigma^2 \sim \chi^2(n-1)$. Furthermore, the sample mean and sample variance are independent.
  • $E(Y) = k,\ Var(Y) = 2k$
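
A simulation sketch of the chi-square facts above: build $Y$ as a sum of $k$ squared standard normals and compare its moments with $k$ and $2k$ (the choice of $k$ is arbitrary):

```python
# Sketch: Y = sum of k squared standard normals should have mean k and
# variance 2k, matching Gamma(k/2, rate 1/2) moments; k is arbitrary.
import numpy as np

rng = np.random.default_rng(3)
k, reps = 5, 1_000_000
Y = (rng.standard_normal((reps, k)) ** 2).sum(axis=1)
print("mean:", Y.mean(), "expected:", k)      # Gamma mean: (k/2)/(1/2) = k
print("var: ", Y.var(), "expected:", 2 * k)   # Gamma var: (k/2)/(1/2)^2 = 2k
```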

(The t-DISTRIBUTION): Let $X, X_1, \ldots, X_n \overset{\text{i.i.d.}}{\sim} N(0,1)$ and $Y = \sum_{i=1}^n X_i^2$, so that $Y \sim \chi^2(n)$. We define $X/\sqrt{Y/n} \sim t_n$. Let $U = X/\sqrt{Y/n}$; then the PDF is $f_U(u) = \frac{\Gamma((n+1)/2)}{\sqrt{n\pi}\,\Gamma(n/2)}(1+u^2/n)^{-(n+1)/2}$

  • $t_n \to N(0,1)$ in distribution as $n \to \infty$
  • $n = 1 \implies$ the Cauchy Distribution

(The F-DISTRIBUTION): Let $X_1, \ldots, X_m \overset{\text{i.i.d.}}{\sim} N(0,1)$ and $Y_1, \ldots, Y_n \overset{\text{i.i.d.}}{\sim} N(0,1)$, with the two samples independent of each other. Let $Z_x = \sum_{i=1}^m X_i^2$ and $Z_y = \sum_{i=1}^n Y_i^2$. We define $\frac{Z_x/m}{Z_y/n} \sim F(m,n)$. Let $U = \frac{Z_x/m}{Z_y/n}$; then the PDF is $f_U(u) = \frac{\Gamma((m+n)/2)}{\Gamma(m/2)\Gamma(n/2)}\left(\frac{m}{n}u\right)^{\frac{m}{2}-1}\left(1+\frac{m}{n}u\right)^{-\frac{m+n}{2}}\left(\frac{m}{n}\right)$ for $u > 0$

  • $U \sim F(m,n) \implies 1/U \sim F(n,m)$
  • $U_n \sim F(m,n) \implies mU_n \to \chi^2(m)$ in distribution as $n \to \infty$
  • $U \sim t_k \implies U^2 \sim F(1,k)$

Try to prove all of these sampling distributions using the change-of-variables technique.
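
As a sanity check (rather than a proof), the relationships above can be simulated; the sketch below compares quantiles of $U^2$ for $U \sim t_k$ against an $F(1,k)$ variable built from chi-square sums, with an arbitrary $k$:

```python
# Sketch: U ~ t_k built as X / sqrt(Y/k); U^2 should match F(1, k) built
# from chi-square ratios. Quantiles of both samples should roughly agree.
import numpy as np

rng = np.random.default_rng(4)
k, reps = 10, 500_000  # arbitrary choices

X = rng.standard_normal(reps)
Y = (rng.standard_normal((reps, k)) ** 2).sum(axis=1)   # chi-square(k)
U2 = (X / np.sqrt(Y / k)) ** 2                          # should be F(1, k)

Zx = rng.standard_normal(reps) ** 2                     # chi-square(1)
Zy = (rng.standard_normal((reps, k)) ** 2).sum(axis=1)  # chi-square(k)
Fsample = (Zx / 1) / (Zy / k)                           # F(1, k) by definition

for q in (0.25, 0.5, 0.9):
    print(q, np.quantile(U2, q), np.quantile(Fsample, q))
```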

Compound Distribution

Let $X_1, X_2, \ldots$ be an i.i.d. sequence of random variables, and let $N$ be a non-negative integer-valued random variable independent of $\{X_i\}$. Then $S = \sum_{i=1}^N X_i$ is a random variable with a compound distribution.

  • $E[S] = E[N]\,E[X_1]$
    • Write $E[S] = E[\sum_{i=1}^N X_i]$, where $E[X_i] = E[X_1] < \infty$, and define the indicator $I_i = I_{\{1,\ldots,N\}}(i)$, i.e. $I_i = 1$ exactly when $i \le N$
    • Then $E[S] = E[\sum_{i=1}^{\infty} I_iX_i] = \sum_{i=1}^{\infty} E[I_i]E[X_i] = E[X_1]\sum_{i=1}^{\infty} E[I_i] = E[X_1]\,E[\sum_{i=1}^{\infty} I_i] = E[X_1]\,E[N]$, where the factorization uses that $N$ (hence each $I_i$) is independent of the $X_i$
  • $m_S(t) = r_N(m_{X_1}(t))$, where $r_N$ is the probability generating function of $N$
    • $m_S(t) = E[e^{tS}] = E[E[e^{tS} \mid N]] = \sum_{j=0}^{\infty} P(N = j)\,E[\exp(t\sum_{i=1}^j X_i)] = \sum_{j=0}^{\infty} P(N = j)\,[m_{X_1}(t)]^j = r_N(m_{X_1}(t))$, where $r_N(s) = \sum_{j=0}^{\infty} s^j P(N = j)$, using that $N$ is independent of the $X_i$
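
A simulation sketch of $E[S] = E[N]\,E[X_1]$; the choices $N \sim \text{Poisson}(2)$ and $X_i \sim \text{Exponential}(\text{rate } 1.5)$ are illustrative assumptions:

```python
# Sketch: check E[S] = E[N] E[X_1] for a compound sum. N ~ Poisson(2) and
# X_i ~ Exponential(rate 1.5) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
reps, lam, rate = 200_000, 2.0, 1.5
N = rng.poisson(lam, size=reps)
S = np.array([rng.exponential(1 / rate, size=n).sum() for n in N])
print("E[S] empirical:", S.mean())
print("E[N] E[X_1]:   ", lam / rate)  # E[N] = lam, E[X_1] = 1/rate
```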

Mixture Distribution

Let $X_1, \ldots, X_n$ be a sequence of random variables with CDFs $F_1, \ldots, F_n$. Let $p_1, \ldots, p_n$ be positive real numbers such that $\sum_{i=1}^n p_i = 1$. Then $G(x) = p_1F_1(x) + \cdots + p_nF_n(x)$ is the CDF of the mixture distribution.

Is the random variable continuous or discrete?

e.g. 2.5.6 from E&R: Let $X_1 \sim \text{Poisson}(3)$ with CDF $F_1$ and $X_2 \sim N(0,1)$ with CDF $F_2$, and take $p_1 = 1/5,\ p_2 = 4/5$. Let $Y$ be the random variable with mixture distribution CDF $G(x) = \frac{1}{5}F_1(x) + \frac{4}{5}F_2(x)$.

  • If $Y$ were continuous, then $P(Y = y) = 0$ for all $y$. But $P(Y = y) = F_Y(y) - F_Y(y^-) = \frac{1}{5}F_1(y) + \frac{4}{5}F_2(y) - \frac{1}{5}F_1(y^-) - \frac{4}{5}F_2(y^-) = \frac{1}{5}(F_1(y) - F_1(y^-)) + \frac{4}{5}(F_2(y) - F_2(y^-)) = \frac{1}{5}P(X_1 = y) + 0$, since $X_2$ is continuous

    • $P(X_1 = y) = \frac{3^ye^{-3}}{y!}$ for integers $y \ge 0$, so $P(Y = y) = \begin{cases} \frac{1}{5}\frac{3^ye^{-3}}{y!} & y = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$, and therefore $Y$ is not continuous.
  • If $Y$ were discrete, its probabilities would sum to 1. But $\sum_{y \ge 0} P(Y = y) = \sum_{y \ge 0} \frac{1}{5}\frac{3^ye^{-3}}{y!} = 1/5 \ne 1$, so $Y$ is not discrete either.
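
A simulation sketch of this example: sample the component first, then the value, and check that the atom at $0$ has mass $\frac{1}{5}P(X_1 = 0) = e^{-3}/5$:

```python
# Sketch: sample from G = (1/5) F_1 + (4/5) F_2 by picking a component first,
# then check the atom at 0: P(Y = 0) should be (1/5) P(X_1 = 0) = e^{-3} / 5.
from math import exp
import numpy as np

rng = np.random.default_rng(6)
reps = 1_000_000
pick_poisson = rng.random(reps) < 1 / 5  # True -> Poisson(3) component
Y = np.where(pick_poisson, rng.poisson(3.0, reps), rng.standard_normal(reps))
print("P(Y = 0) empirical:", np.mean(Y == 0.0))
print("e^{-3} / 5:        ", exp(-3) / 5)
```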